From Submit to Submitted via Submission: On Lexical Rules in Large-Scale Lexicon Acquisition

Authors

  • Evelyne Viegas
  • Boyan A. Onyshkevych
  • Victor Raskin
  • Sergei Nirenburg
Abstract

This paper deals with the discovery, representation, and use of lexical rules (LRs) during large-scale semi-automatic computational lexicon acquisition. The analysis is based on a set of LRs implemented and tested on the basis of Spanish and English business- and finance-related corpora. We show that, though the use of LRs is justified, they do not come cost-free. Semi-automatic output checking is required, even with blocking and preemption procedures built in. Nevertheless, large-scope LRs are justified because they facilitate the unavoidable process of large-scale semi-automatic lexical acquisition. We also argue that the place of LRs in the computational process is a complex issue.

1 Introduction

This paper deals with the discovery, representation, and use of lexical rules (LRs) in the process of large-scale semi-automatic computational lexicon acquisition. LRs are viewed as a means to minimize the need for costly lexicographic heuristics, to reduce the number of lexicon entry types, and generally to make the acquisition process faster and cheaper. The findings reported here have been implemented and tested on the basis of Spanish and English business- and finance-related corpora.

The central idea of our approach, that there are systematic paradigmatic meaning relations between lexical items, such that, given an entry for one such item, other entries can be derived automatically, is certainly not novel. In modern times, it has been reintroduced into linguistic discourse by the Meaning-Text group in their work on lexical functions (see, for instance, (Mel'čuk, 1979)). It has been lately incorporated into computational lexicography in (Atkins, 1991), (Ostler and Atkins, 1992), (Briscoe and Copestake, 1991), (Copestake and Briscoe, 1992), (Briscoe et al., 1993).

§ Also of US Department of Defense, Attn R525, Fort Meade, MD 20755, USA and Carnegie Mellon University, Pittsburgh, PA, USA.
§§ Also of Purdue University NLP Lab, W Lafayette, IN 47907, USA.
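The idea of deriving new entries from an existing one, with blocking or preemption and a manual-checking step, can be illustrated with a small sketch. It is not the authors' implementation: the names `Entry`, `LexicalRule`, `derive`, the `attested` and `preempted` sets, the toy spelling rules, and the string-based meanings are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Set, Tuple


@dataclass(frozen=True)
class Entry:
    lemma: str
    pos: str       # part of speech, e.g. "V", "N", "Adj"
    meaning: str   # toy stand-in for a semantic representation


@dataclass(frozen=True)
class LexicalRule:
    name: str
    source_pos: str
    target_pos: str
    form: Callable[[str], str]      # builds the derived lemma
    meaning: Callable[[str], str]   # builds the derived meaning


def _stem(lemma: str) -> str:
    # Toy spelling rule (assumption): double the final "t" of "-mit" verbs.
    return lemma + "t" if lemma.endswith("mit") else lemma


RULES = [
    LexicalRule("event-nominal", "V", "N",
                lambda l: l[:-1] + "ssion" if l.endswith("mit") else l + "ion",
                lambda m: f"EVENT-OF({m})"),
    LexicalRule("participle-adj", "V", "Adj",
                lambda l: _stem(l) + "ed",
                lambda m: f"RESULT-OF({m})"),
    LexicalRule("agent-nominal", "V", "N",
                lambda l: _stem(l) + "er",
                lambda m: f"AGENT-OF({m})"),
]


def derive(entry: Entry, rules: List[LexicalRule],
           attested: Set[str], preempted: Set[str]
           ) -> Tuple[List[Entry], List[Entry]]:
    """Apply every matching rule; return (accepted, needs_review)."""
    accepted, needs_review = [], []
    for rule in rules:
        if rule.source_pos != entry.pos:
            continue
        lemma = rule.form(entry.lemma)
        if lemma in preempted:
            # Blocked: an existing word already fills this slot.
            continue
        candidate = Entry(lemma, rule.target_pos, rule.meaning(entry.meaning))
        # Forms seen in the corpus are accepted; the rest go to a human checker.
        (accepted if lemma in attested else needs_review).append(candidate)
    return accepted, needs_review


submit = Entry("submit", "V", "SUBMIT")
accepted, needs_review = derive(
    submit, RULES, attested={"submission", "submitted"}, preempted=set())
print([e.lemma for e in accepted])
print([e.lemma for e in needs_review])
```

In this sketch the unattested candidate is routed to a review list rather than discarded, which mirrors the abstract's point that rule output still needs semi-automatic checking even when blocking is built in.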
Pustejovsky (Pustejovsky, 1991, 1995) has coined an attractive term to capture these phenomena: one of the declared objectives of his 'generative lexicon' is a departure from sense enumeration to sense derivation with the help of lexical rules. The generative lexicon provides a useful framework for potentially infinite sense modulation in specific contexts (cf. (Leech, 1981), (Cruse, 1986)), due to type coercion (e.g., (Pustejovsky, 1993)) and similar phenomena. Most LRs in the generative lexicon approach, however, have been proposed for small classes of words and explain such grammatical and semantic shifts as +count to -count or -common to +common. While shifts and modulations are important, we find that the main significance of LRs is their promise to aid the task of massive lexical acquisition.

Section 2 below outlines the nature of LRs in our approach and their status in the computational process. Section 3 presents a fully implemented case study, the morpho-semantic LRs. Section 4 briefly reviews the cost factors associated with LRs; the argument in it is based on another case study, the adjective-related LRs, which is especially instructive since it may mislead one into thinking that LRs are unconditionally beneficial.

2 Nature of Lexical Rules

2.1 Ontological Semantic Background

Our approach to NLP can be characterized as ontology-driven semantics (see, e.g., (Nirenburg and Levin, 1992)). The lexicon for which our LRs are introduced is intended to support the computational specification and use of text meaning representations. The lexical entries are quite complex, as they must contain many different types of lexical knowledge that may be used by specialist processes for automatic text analysis or generation (see, e.g.,

Similar resources

Spanish Lexical Acquisition via Morpho-Semantic Constructive Derivational Morphology

This paper describes an algorithm for Spanish derivational morphology whose output is generalizable to two different lexicon acquisition situations. One is the process of automatic lexicon acquisition via the use of Morpho-Semantic Lexical Rules (MSLRs) (Viegas, Gonzalez, & Longwell 1996), usable in semantically based Natural Language Processing (Nirenburg et al. 1996), in order to considerably r...

The Effect of Lexicon-based Debates on the Felicity of Lexical Equivalents in Translating Literary Texts by Iranian EFL Learners

This study was an attempt to investigate the effect of lexicon-based debates on the felicity of lexical equivalents in translating literary texts by Iranian EFL learners.  To fulfill the purpose of this study, 59 university students, majoring in English Translation, were randomly assigned to the experimental and control groups from a total of 73 students based on their performance on a mock TOE...

Enriching Morphological Lexica through Unsupervised Derivational Rule Acquisition

In a morphological lexicon, each entry combines a lemma with a specific inflection class, often defined by a set of inflection rules. Therefore, such lexica usually give a satisfying account of inflectional operations. Derivational information, however, is usually badly covered. In this paper we introduce a novel approach for enriching morphological lexica with derivational links between entrie...

Breadth And Depth Of Semantic Lexicons - Notes On The Workshop

Whereas the former has been regarded as a topical issue for quite some time, the latter is only now receiving its due attention. This workshop will concentrate on lexical rules as a regulator of breadth and depth of the lexicons. Lexical rules are known under a variety of names, e.g., Leech's (1981) "semantic transfer rules," "lexical implication rules" of Ostler and Atkins (1991) and others. T...

Multiword Lexical Acquisition And Dictionary Formalization

In this paper, we present the current state of development of a large-scale lexicon built at LabEL for Portuguese. We will concentrate on multiword expressions (MWE), particularly on multiword nouns, (i) illustrating their most relevant morphological features, and (ii) pointing out the methods and techniques adopted to generate the inflected forms from lemmas. Moreover, we describe a corpus-ba...

Publication date: 1996